Karaganda Region
KazQAD: Kazakh Open-Domain Question Answering Dataset
Rustem Yeshpanov, Pavel Efimov, Leonid Boytsov, Ardak Shalkarbayuli, Pavel Braslavski
We introduce KazQAD, a Kazakh open-domain question answering (ODQA) dataset that can be used in both reading comprehension and full ODQA settings, as well as for information retrieval experiments. KazQAD contains just under 6,000 unique questions with extracted short answers and nearly 12,000 passage-level relevance judgements. We use a combination of machine translation, Wikipedia search, and in-house manual annotation to ensure annotation efficiency and data quality. The questions come from two sources: translated items from the Natural Questions (NQ) dataset (used only for training) and the original Kazakh Unified National Testing (UNT) exam (used for development and testing). The accompanying text corpus contains more than 800,000 passages from the Kazakh Wikipedia. As a supplementary dataset, we release around 61,000 question-passage-answer triples from the NQ dataset that have been machine-translated into Kazakh. We develop baseline retrievers and readers that achieve reasonable scores in retrieval (NDCG@10 = 0.389, MRR = 0.382), reading comprehension (EM = 38.5, F1 = 54.2), and full ODQA (EM = 17.8, F1 = 28.7) settings. Nevertheless, these results are substantially lower than state-of-the-art results for English QA collections, which leaves ample room for improvement. We also show that OpenAI's ChatGPT v3.5 cannot answer KazQAD test questions in the closed-book setting with acceptable quality. The dataset is freely available under the Creative Commons licence (CC BY-SA) at https://github.com/IS2AI/KazQAD.
- Asia > Russia (0.14)
- North America > United States (0.14)
- Asia > Kazakhstan > Akmola Region > Astana (0.04)
- Research Report (0.64)
- Overview (0.46)
- Education (1.00)
- Information Technology (0.88)
- Leisure & Entertainment > Sports (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.86)
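The KazQAD abstract above reports exact match (EM), F1, and MRR scores. As a rough illustration of what these metrics measure, here is a minimal sketch in Python. It assumes SQuAD-style token-overlap scoring with a simplified normalization (lowercasing and ASCII punctuation stripping only); the function names are hypothetical, and the KazQAD evaluation scripts may normalize and score differently.

```python
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace.

    Assumption: a simplified normalization, not the dataset's own.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized prediction equals the normalized gold answer."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(gold)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def reciprocal_rank(ranked_ids: list[str], relevant_ids: set[str]) -> float:
    """1 / rank of the first relevant passage, or 0.0 if none is retrieved."""
    for rank, passage_id in enumerate(ranked_ids, start=1):
        if passage_id in relevant_ids:
            return 1.0 / rank
    return 0.0
```

Averaging `exact_match` and `token_f1` over all questions gives the EM and F1 figures (reported as percentages), and averaging `reciprocal_rank` over all queries gives MRR.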
Space capsule with astronauts returns safely to Earth
Three astronauts have landed back on Earth after nearly six months on board the International Space Station. A Russian Soyuz capsule carrying Randy Bresnik of the U.S. National Aeronautics and Space Administration (NASA), Sergey Ryazanskiy of the Russian space agency Roscosmos, and Italy's Paolo Nespoli of the European Space Agency descended under a red-and-white parachute and landed on schedule at 2:37 p.m. local time (0837 GMT) on the windswept, snow-covered steppe outside a remote town in Kazakhstan's central Karaganda region. The three were extracted from the capsule within 20 minutes and appeared to be in good condition.
- Government > Space Agency (1.00)
- Government > Regional Government > North America Government > United States Government (0.82)